15 research outputs found

    Automated Generation of Cross-Domain Analogies via Evolutionary Computation

    Analogy plays an important role in creativity, and is extensively used in science as well as art. In this paper we introduce a technique for the automated generation of cross-domain analogies based on a novel evolutionary algorithm (EA). Unlike existing work in computational analogy-making restricted to creating analogies between two given cases, our approach, for a given case, is capable of creating an analogy along with the novel analogous case itself. Our algorithm is based on the concept of "memes", which are units of culture, or knowledge, undergoing variation and selection under a fitness measure, and represents evolving pieces of knowledge as semantic networks. Using a fitness function based on Gentner's structure mapping theory of analogies, we demonstrate the feasibility of spontaneously generating semantic networks that are analogous to a given base network. Comment: Conference submission, International Conference on Computational Creativity 2012 (8 pages, 6 figures)

    FNet: Mixing Tokens with Fourier Transforms

    We show that Transformer encoder architectures can be massively sped up, with limited accuracy costs, by replacing the self-attention sublayers with simple linear transformations that "mix" input tokens. These linear transformations, along with standard nonlinearities in feed-forward layers, prove competent at modeling semantic relationships in several text classification tasks. Most surprisingly, we find that replacing the self-attention sublayer in a Transformer encoder with a standard, unparameterized Fourier Transform achieves 92-97% of the accuracy of BERT counterparts on the GLUE benchmark, but trains nearly seven times faster on GPUs and twice as fast on TPUs. The resulting model, FNet, also scales very efficiently to long inputs. Specifically, when compared to the "efficient" Transformers on the Long Range Arena benchmark, FNet matches the accuracy of the most accurate models, but is faster than the fastest models across all sequence lengths on GPUs (and across relatively shorter lengths on TPUs). Finally, FNet has a light memory footprint and is particularly efficient at smaller model sizes: for a fixed speed and accuracy budget, small FNet models outperform Transformer counterparts.

    LongT5: Efficient Text-To-Text Transformer for Long Sequences

    Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models. In this paper, we present a new model, called LongT5, with which we explore the effects of scaling both the input length and model size at the same time. Specifically, we integrated attention ideas from long-input transformers (ETC), and adopted pre-training strategies from summarization pre-training (PEGASUS) into the scalable T5 architecture. The result is a new attention mechanism we call Transient Global (TGlobal), which mimics ETC's local/global attention mechanism, but without requiring additional side-inputs. We are able to achieve state-of-the-art results on several summarization tasks and outperform the original T5 models on question answering tasks. Comment: Accepted in NAACL 202

    Functional Interpolation for Relative Positions Improves Long Context Transformers

    Preventing the performance decay of Transformers on inputs longer than those used for training has been an important challenge in extending the context length of these models. Though the Transformer architecture has fundamentally no limits on the input sequence lengths it can process, the choice of position encoding used during training can limit the performance of these models on longer inputs. We propose a novel functional relative position encoding with progressive interpolation, FIRE, to improve Transformer generalization to longer contexts. We theoretically prove that this can represent some of the popular relative position encodings, such as T5's RPE, Alibi, and Kerple. We next empirically show that FIRE models have better generalization to longer contexts on both zero-shot language modeling and long text benchmarks.

    Generating Maps Using Markov Chains

    In this paper we outline a method of procedurally generating maps using Markov Chains. Our method attempts to learn what makes a "good" map from a set of given human-authored maps, and then uses those learned patterns to generate new maps. We present an empirical evaluation using the game "Super Mario Bros.," showing encouraging results.

    Story Representation In Analogy-Based Story Generation In Riu

    Computational analogy offers a promising direction for algorithmically generating stories, a key challenge in computational narrative. Since analogy methods are very sensitive to the story representation being used, this paper focuses on story representation for analogy-based story generation. Specifically, we analyze existing story representation formalisms and propose a new approach based on the cognitive semantics theory of force dynamics. Finally, we present the results of our analogy-based interactive narrative system, Riu, to illustrate the utility of our proposal. © 2010 IEEE